Learning to Walk in Every Direction with the TBR-Learning algorithm

Authors

  • Antoine Cully
  • Jean-Baptiste Mouret
Abstract

Legged robots are versatile machines that can outperform wheeled robots on rough terrain (Raibert, 1986), for instance in exploration or rescue missions. Their versatility is, however, tempered by their mechanical and control complexity, which makes them prone to mechanical damage and difficult to control robustly (Raibert, 1986; Bongard et al., 2006; Koos et al., 2013a). A promising way to compensate for these two weaknesses is to let robots discover on their own the best way to move in the current situation. A legged robot can thus cope with unexpected terrain or mechanical damage by learning a new walking gait (Bongard et al., 2006; Koos et al., 2013a), in the same way that animals learn to limp with a sprained ankle. Reinforcement learning (Kohl and Stone, 2004; Tedrake et al., 2005) and evolutionary algorithms (Zykov et al., 2004; Chernova and Veloso, 2004; Hornby et al., 2005) have been investigated to discover walking gaits for physical robots. Nevertheless, most of these investigations are limited to straight, forward walking, whereas a robot that only walks along a straight line is obviously unable to accomplish any mission. Only a handful of works deal with controllers able to turn or to change the walking speed. In these cases, controllers are successively evaluated in each possible direction (Mouret et al., 2006), or learned with an incremental process (Kodjabachian and Meyer, 1998). Compared to learning a simple controller, these two approaches significantly increase the learning time and the complexity of the search process. In the present paper, we describe the Transferability-based Behavioral Repertoire Evolution (TBR-Evolution), a new learning algorithm that allows a robot to learn to walk in every direction in a single run of an evolutionary algorithm.
This algorithm combines the BR-Evolution algorithm (Cully and Mouret, 2013), which creates a behavioral repertoire in a single run, with the transferability approach (Koos et al., 2013b), which uses a simulator to minimize the number of evaluations on the physical robot when evolving controllers. A behavioral repertoire is a collection of simple controllers, where each of them reaches one position. An exter...
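As a rough illustration of the repertoire idea described in the abstract, the sketch below stores a few controllers keyed by the position each one reaches and picks the one whose end point is closest to a requested target. Everything here is an assumption made for illustration: the names `Position`, `Controller` and `nearest_controller`, the toy end points and parameter values, and the nearest-neighbour lookup itself, which the abstract does not specify. It is not the TBR-Evolution algorithm and says nothing about how the repertoire is evolved.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float]   # hypothetical: (x, y) end point of a gait
Controller = List[float]         # hypothetical: flat vector of gait parameters


def nearest_controller(repertoire: Dict[Position, Controller],
                       target: Position) -> Tuple[Position, Controller]:
    """Return the stored end point closest to `target` and its controller."""
    if not repertoire:
        raise ValueError("empty repertoire")
    # Euclidean distance between each stored end point and the target.
    best = min(repertoire, key=lambda p: math.dist(p, target))
    return best, repertoire[best]


# Toy repertoire: made-up end points and made-up parameter vectors.
repertoire: Dict[Position, Controller] = {
    (0.5, 0.0): [0.1, 0.9, 0.3],
    (0.0, 0.5): [0.7, 0.2, 0.4],
    (-0.5, 0.0): [0.6, 0.6, 0.1],
}

reached, params = nearest_controller(repertoire, (0.4, 0.1))
```

With this toy data, a request for the point (0.4, 0.1) selects the controller stored under (0.5, 0.0), since that end point is the closest of the three.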

Similar resources

Evolving a Behavioral Repertoire for a Walking Robot

Numerous algorithms have been proposed to allow legged robots to learn to walk. However, most of these algorithms are devised to learn walking in a straight line, which is not sufficient to accomplish any real-world mission. Here we introduce the Transferability-based Behavioral Repertoire Evolution algorithm (TBR-Evolution), a novel evolutionary algorithm that simultaneously discovers several ...

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM), a fuzzy learning scheme inspired by some behavioral features of human brain functionality. The high-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

Region Directed Diffusion in Sensor Network Using Learning Automata: RDDLA

One of the main challenges in wireless sensor networks is the energy problem and the life cycle of nodes. Several methods can be used to increase the life cycle of nodes. One of these methods is load balancing across nodes while transmitting data from source to destination. The directed diffusion algorithm is one of the proposed methods in wireless sensor networks and is a data-oriented algorithm. Direct...

Multi-objective Differential Evolution for the Flow shop Scheduling Problem with a Modified Learning Effect

This paper proposes an effective multi-objective differential evolution algorithm (MDES) to solve a permutation flow shop scheduling problem (PFSSP) with modified Dejong's learning effect. The proposed algorithm combines the basic differential evolution (DE) with local search and borrows the selection operator from NSGA-II to improve the general performance. First, the problem is encoded with a...

Publication date: 2014